Feb 28

1: Alfa Angle Calculation and Visualization

First step: reading in the data file

library(here)
## here() starts at /Users/josephdekel/git_projects/fish-reflectance
here()
## [1] "/Users/josephdekel/git_projects/fish-reflectance"
here("Data", "varied_spectra.csv", "pablo_equation_1_transposed.csv" )
## [1] "/Users/josephdekel/git_projects/fish-reflectance/Data/varied_spectra.csv/pablo_equation_1_transposed.csv"
df <- read.csv("Data/varied_spectra.csv")

I first created synthetic data that would effectively demonstrate how different patches on fish can be distinguished using hyperspectral measurements. The varied_spectra.csv file was designed to contain five measurements per patch type, simulating natural variation while maintaining distinct spectral signatures and characteristic spectral patterns that mirror biological reality.

Function to calculate the spectral angle:

# Calculating spectral angle between two spectra (x and y)
calc_spectral_angle <- function(x, y) 
{
  sum_xy <- sum(x * y)    
  sum_x2 <- sum(x^2)      
  sum_y2 <- sum(y^2)
  alfa <- acos(sum_xy / sqrt(sum_x2 * sum_y2))
  return(alfa)  # Returning angle in radians
}

The code implementation centres around the spectral angle calculation function, which implements the formula from the paper: α = cos⁻¹(ΣXY/√(Σ(X)²Σ(Y)²)). This calculation is crucial for identifying the most representative measurement for each patch type by finding the measurement closest to the mean. The function returns angles in radians, providing a standardized measure of similarity between spectra.

First Plot:

# Creating first plot (initialized with first measurement)
plot(df[,1], df[,2],       # Plotting wavelength (col 1) vs first measurement (col 2) 
     type="l",             # Line plot
     ylim=c(0,100),        # Y-axis from 0 to 100%
     xlab="Wavelength (nm)", 
     ylab="Reflectance (%)",
     main="Patch Measurements and Means")

# Defining colors and patch names for plotting
colors <- c("black", "blue", "orange", "gray")    # Colour for each patch
patches <- c("black", "blue", "orange", "white")  # Names of patches

# Ploting all individual measurements for each patch
for(patch in patches) 
{
  patch_cols <- grep(patch, names(df))         # Finding all columns for this patch
  for(col in patch_cols)                       # Loop through each column of this patch
    {
    lines(df[,1], df[,col],                    # Adding line for each measurement
          col=colors[which(patches == patch)]) # Using corresponding colour
  }
}

# Calculating & plotting mean spectra
means <- data.frame(wavelength = df[,1])        # Creating dataframe for means
for(patch in patches) 
{
  patch_cols <- grep(patch, names(df))          # Finding columns for this patch
  means[[patch]] <- rowMeans(df[,patch_cols])   # Calculating mean spectrum
  lines(df[,1], means[[patch]],                 # Plotting mean as thicker line
        col=colors[which(patches == patch)], 
        lwd=3)                                  
}

# Legend for first plot
legend("topleft", patches, col=colors, lwd=2)

The plot displays all measurements and means, with individual measurements shown as thin lines and means as thicker lines in corresponding colours. This visualization demonstrates the variation within patches while maintaining clear distinction between patch types.

Finding representative spectrum (smallest α) for each patch:

# Finding representative spectrum (smallest α) for each patch
rep_spectra <- data.frame(wavelength = df[,1])  # Creating dataframe for representative spectra (starting with wavelengths)
for(patch in patches) 
{
  patch_cols <- grep(patch, names(df))          # Finding all columns for this patch
  mean_spec <- means[[patch]]                   # Getting mean spectrum we calculated earlier
  angles <- numeric(length(patch_cols))         # Creating empty vector to store angles
  
  # Calculating spectral angle between each measurement and mean
  for(i in seq_along(patch_cols)) 
  {
    angles[i] <- calc_spectral_angle(df[,patch_cols[i]], mean_spec)
  }
  
  # Selecting measurement with smallest angle as representative
  rep_spectra[[patch]] <- df[,patch_cols[which.min(angles)]]
  
  # Printing the smallest angle found
  cat(sprintf("%s patch - smallest angle: %f\n", patch, min(angles)))
}
## black patch - smallest angle: 0.005026
## blue patch - smallest angle: 0.001958
## orange patch - smallest angle: 0.000920
## white patch - smallest angle: 0.000547

Second Plot:

# Creating second plot - representative spectra
plot(rep_spectra$wavelength, rep_spectra$black,   # Initialize with black patch
     type="l", 
     ylim=c(0,100),
     xlab="Wavelength (nm)", 
     ylab="Reflectance (%)",
     main="Representative Spectra (Smallest α)",
     col="black")

# Adding lines for other patches
lines(rep_spectra$wavelength, rep_spectra$blue, col="blue")
lines(rep_spectra$wavelength, rep_spectra$orange, col="orange")
lines(rep_spectra$wavelength, rep_spectra$white, col="gray")

# Legend for second plot
legend("topleft", patches, col=colors, lwd=1)

This plot shows only the representative spectra (those closest to the mean), providing a cleaner visualization for quick comparison between patches.

The data processing flow moves systematically from reading the CSV file through plotting all measurements, calculating means, finding representative spectra using spectral angles, and finally creating the comparative plots. The implementation uses nested loops for efficiency, with the outer loop processing each patch type and the inner loop handling individual measurements. The grep function identifies relevant columns for each patch type. In the code, the wavelength column positioned first for easy reference.

Colour choices in the visualization match the actual patch colours for intuitive interpretation, with line thickness differentiating means from individual measurements. The y-axis range of 0-100% covers the full possible reflectance range, while the wavelength range matches the visible spectrum. This approach provides a practical demonstration of the spectral analysis methods described in the Piranha paper, allowing both examination of variation within patches and comparison between different patch types.

2: Patches Variation Analysis

Trying to explore the methods of using Alfa Angle as a tool for comparing reflectance spectra (as seen in the piranha paper https://www.nature.com/articles/s41598-021-95713-0).

This code analyzes Trigger Fish patch spectral data to determine:

  1. Whether we can tell different patch types apart (black, white, orange, blue)

  2. How much variation exists within patches vs. across individuals

Reading in data

# =============================================================================
# Patches Variation Analysis 
# This code analyzes Trigger Fish patch spectral data to determine:
# 1. Whether we can tell different patch types apart (black, white, orange, blue)
# 2. How much variation exists within patches vs. across individuals
# =============================================================================

df <- read.csv("Data/pablo_equation_1_transposed.csv")

Defining the spectral angle calculation function

# This function calculates the angular difference between two spectral signatures
# Based on the formula from the piranha paper: α = cos⁻¹(ΣXY/√(Σ(X)²Σ(Y)²))
# Smaller angles indicate greater similarity between spectra
calc_spectral_angle <- function(x, y) 
{
  sum_xy <- sum(x * y)     
  sum_x2 <- sum(x^2)       
  sum_y2 <- sum(y^2)       
  alfa <- acos(sum_xy / sqrt(sum_x2 * sum_y2))
  
  return(alfa)  # Return angle in radians
}

This function calculates the angular difference between two spectral signatures. Based on the formula from the piranha paper: α = cos⁻¹(ΣXY/√(Σ(X)²Σ(Y)²)). Smaller angles indicate greater similarity between spectra.

Identifying patch types and individuals for analysis

# Defining our patch types (colors) and individual identifiers (repeats)
patch_types <- c("black", "white", "orange", "blue")  # The four patch types
individuals <- c("01", "02", "03", "04")              # The four individuals (repeats)

# =============================================================================
# Organizing the data by patch type and individual
# =============================================================================
# Creating a nested list structure to easily access data:
# patch_columns$black$01 will contain the spectral data for black patch on individual 01
patch_columns <- list()
for(patch in patch_types) 
{
  # Creating a sub-list for each patch type
  patch_columns[[patch]] <- list()
  
  # For each individual, finding and storing the corresponding column data
  for(ind in individuals) 
  {
    col_name <- paste0(patch, "_", ind)  # Constructing column name (e.g., "black_01")
    
    # Only adding data if the column exists in our dataset
    if(col_name %in% names(df)) 
    {
      patch_columns[[patch]][[ind]] <- df[[col_name]]  # Storing the spectral data
    }
  }
}
  • The data was organized into a nested structure to facilitate analysis by both patch type and individual.
  • This organization allows us to easily compare:
    1. The same patch type across different individuals (within-patch variation).
    2. Different patch types on the same individual (between-patch variation).
  • This structured approach allows systematic comparison across both dimensions (patch type and individual).
  • Without this organization, it would be difficult to separate biological variation from measurement variation.

Calculating mean spectra for each patch type

patch_means <- data.frame(wavelength = df$wavelength)  # Starting with wavelength column
for(patch in patch_types) 
{
  # Combining all measurements for this patch type into a matrix
  patch_data <- do.call(cbind, patch_columns[[patch]])
  
  # Calculating row means (mean reflectance at each wavelength)
  patch_means[[patch]] <- rowMeans(patch_data, na.rm = TRUE)
}

PLOT 1: Mean spectra for each patch type

# Defining colors for plotting that match the patch types
colors <- c("black", "gray", "orange", "blue")

# Starting with the first patch (black)
plot(df$wavelength, patch_means$black, 
     type="l",                                       # Line plot
     col=colors[1],                                  # black color
     lwd=2,                                          
     ylim=c(0, max(patch_means[,-1], na.rm=TRUE) * 1.1),  # Y-axis range with 10% margin
     xlab="Wavelength (nm)",                         
     ylab="Reflectance (%)",                         
     main="Mean Spectra by Patch Type")              

# Adding lines for the remaining patch types
for(i in 2:length(patch_types)) 
{
  lines(df$wavelength,                     # X values (wavelength)
        patch_means[[patch_types[i]]],     # Y values (reflectance)
        col=colors[i],                     # Color corresponding to patch type
        lwd=2)                             
}

# Legend to identify each line
legend("topright",                         
       patch_types,                        # Labels for legend
       col=colors,                         # Colors for legend
       lwd=2)   

The image shows:

  • The mean spectral reflectance curves for each patch type (black, white, orange, blue) across the visible spectrum (400-700nm).

  • Each line represents the average reflectance pattern for a specific patch type.

What it tells us:

  • There are clear, distinctive spectral signatures for each patch type.

  • White patches have the highest overall reflectance (70-90%), as expected.

  • Orange patches show moderate reflectance (20-45%) with increasing reflectance at longer wavelengths (characteristic of orange coloration).

  • Blue patches show specific reflectance patterns with a slight peak around 550-580nm.

  • Black patches have the lowest reflectance overall (below 15%).

Importance:

  • This confirms that different patch types have fundamentally different spectral signatures.

  • These distinctive signatures form the basis for using hyperspectral analysis to identify and differentiate patches.

Calculating spectral angles between different patch types

# This quantifies how different each patch type is from the others
cat("Spectral angles between patch types (in radians):\n")
## Spectral angles between patch types (in radians):
# Creating a matrix to store the angles between each pair of patch types
patch_angles <- matrix(0,                          # Initializing with zeros
                       nrow=length(patch_types),   # Rows = number of patch types
                       ncol=length(patch_types))   # Columns = number of patch types

# Adding row and column names to the matrix for clarity
rownames(patch_angles) <- patch_types
colnames(patch_angles) <- patch_types

# Calculating the spectral angle between each pair of patch types
for(i in 1:length(patch_types)) 
{
  for(j in 1:length(patch_types)) 
  {
    if(i != j) # Skipping scenario comparing a patch with itself (angle would be 0)
    {  
      # Calculating spectral angle between mean spectra of two patch types
      patch_angles[i,j] <- calc_spectral_angle(
        patch_means[[patch_types[i]]],     # Mean spectrum of first patch type
        patch_means[[patch_types[j]]]      # Mean spectrum of second patch type
      )
    }
  }
}

# Printing the matrix of angles (rounded to 4 decimal places)
print(round(patch_angles, 4))
##         black  white orange   blue
## black  0.0000 0.2244 0.1224 0.1802
## white  0.2244 0.0000 0.2131 0.1979
## orange 0.1224 0.2131 0.0000 0.1065
## blue   0.1802 0.1979 0.1065 0.0000

The figure above shows:

  • A matrix of spectral angles between the mean spectra of different patch types.

  • Values range from 0 (identical) to higher values (more different).

What this tells us:

  • Black and orange patches are most similar to each other (angle = 0.1224).

  • White and orange patches are most different (angle = 0.2131).

  • Blue patches are most similar to orange patches (angle = 0.1065).

  • White patches are consistently the most different from other patch types.

Importance:

  • This quantifies exactly how different each patch type is from the others.

  • Shows which patches might be most easily confused with each other.

  • Provides a numerical basis for distinguishing patches in classification tasks.

Calculating within-patch variation

# This measures how much variation exists within the same patch type across different individuals
within_variation <- list()
for(patch in patch_types) 
{
  within_variation[[patch]] <- c()  # Initializing empty vector for this patch
  
  # Comparing all possible pairs of individuals for this patch type
  for(i in 1:(length(individuals)-1)) 
  {
    for(j in (i+1):length(individuals)) 
    {
      ind1 <- individuals[i]
      ind2 <- individuals[j]
      
      # Checking if we have data for both individuals for this patch
      if(!is.null(patch_columns[[patch]][[ind1]]) && !is.null(patch_columns[[patch]][[ind2]])) 
      {
        # Calculating spectral angle between the same patch type on different individuals
        angle <- calc_spectral_angle(
          patch_columns[[patch]][[ind1]],   # Patch on first individual
          patch_columns[[patch]][[ind2]]    # Same patch on second individual
        )
        
        # Storing the angle in our list
        within_variation[[patch]] <- c(within_variation[[patch]], angle)
      }
    }
  }
}

Calculating between-patch variation

# This measures how different patch types are within the same individual
# (For example, how different is black from blue on individual 01)
between_variation <- list()
for(ind in individuals) 
{
  between_variation[[ind]] <- c()  # Initializing empty vector for this individual
  
  # Comparing all possible pairs of patch types for this individual
  for(i in 1:(length(patch_types)-1)) 
  {
    for(j in (i+1):length(patch_types)) 
    {
      patch1 <- patch_types[i]
      patch2 <- patch_types[j]
      
      # Checking if we have data for both patch types for this individual
      if(!is.null(patch_columns[[patch1]][[ind]]) && !is.null(patch_columns[[patch2]][[ind]])) 
      {
        # Calculating spectral angle between different patch types on the same individual
        angle <- calc_spectral_angle(
          patch_columns[[patch1]][[ind]],   # First patch type on this individual
          patch_columns[[patch2]][[ind]]    # Second patch type on this individual
        )
        
        # Storing the angle in our list
        between_variation[[ind]] <- c(between_variation[[ind]], angle)
      }
    }
  }
}

PLOT 2: Boxplot of variation within vs. between patches

# Preparing data for boxplot by flattening the lists
within_all <- unlist(within_variation)    # Combining all within-patch angles
between_all <- unlist(between_variation)  # Combining all between-patch angles

# Creating a data frame for plotting
variation_data <- data.frame(
  Variation = c(rep("Within Patch", length(within_all)),     # Labels for within-patch data
                rep("Between Patches", length(between_all))),# Labels for between-patch data
  Angle = c(within_all, between_all)                         # All angle values
)

# Creating the boxplot
boxplot(Angle ~ Variation,               # Formula: Angle grouped by Variation type
        data=variation_data,             # Data to plot
        main="Variation Within vs. Between Patches",  
        ylab="Spectral Angle (radians)",  
        col=c("lightblue", "lightgreen"))  

Within-patch variation is significantly lower than between-patch variation.

Calculating summary statistics for variation

# Printing summary stats for within-patch variation
cat("\nSummary of within-patch variation:\n")
## 
## Summary of within-patch variation:
within_summary <- summary(within_all)
print(within_summary)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01435 0.03265 0.05766 0.06285 0.06473 0.21041
cat("Standard deviation:", sd(within_all), "\n")
## Standard deviation: 0.04891543
# Printing summary stats for between-patch variation
cat("\nSummary of between-patch variation:\n")
## 
## Summary of between-patch variation:
between_summary <- summary(between_all)
print(between_summary)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1053  0.1417  0.2046  0.1832  0.2188  0.2606
cat("Standard deviation:", sd(between_all), "\n")
## Standard deviation: 0.04737195
# =============================================================================
# Statistical test: Are within-patch and between-patch variations different?
# =============================================================================
# Performing t-test to check if the difference is statistically significant
t_result <- t.test(within_all, between_all)
cat("\nT-test comparing within-patch vs. between-patch variation:\n")
## 
## T-test comparing within-patch vs. between-patch variation:
print(t_result)
## 
##  Welch Two Sample t-test
## 
## data:  within_all and between_all
## t = -8.6586, df = 45.953, p-value = 3.252e-11
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.14833038 -0.09237165
## sample estimates:
##  mean of x  mean of y 
## 0.06285286 0.18320387

The figure above shows:

  • Results of a t-test comparing within-patch and between-patch angles.

  • Extremely significant p-value (3.252e-11) confirming the difference.

  • Mean values: within-patch = 0.0629, between-patch = 0.1832.

What this tells us:

  • Within-patch variation is significantly lower than between-patch variation.

  • The 95% confidence interval for the difference is between 0.092 and 0.148 radians.

  • The difference is not due to chance (p << 0.001).

PLOT 3: Spectral angles heatmap

# Creating a matrix of all pairwise spectral angle comparisons
# First, getting all column names that contain patch measurements
all_columns <- c()
for(patch in patch_types) 
{
  for(ind in individuals) 
  {
    col_name <- paste0(patch, "_", ind)  # Constructing column name (e.g., "black_01")
    if(col_name %in% names(df)) 
    {
      all_columns <- c(all_columns, col_name)  # Adding to our list if it exists
    }
  }
}

# Creating matrix to store all pairwise spectral angles
angle_matrix <- matrix(0,                         # Initializing with zeros
                       nrow=length(all_columns),  # Rows = number of columns
                       ncol=length(all_columns))  # Columns = number of columns

# Adding row and column names for clarity
rownames(angle_matrix) <- all_columns
colnames(angle_matrix) <- all_columns

# Calculating spectral angle between each pair of measurements
for(i in 1:length(all_columns)) 
{
  for(j in 1:length(all_columns)) 
  {
    if(i != j) # Skipping scenario comparing a column with itself (angle would be 0)
    {  
      # Calculating spectral angle between two measurements
      angle_matrix[i,j] <- calc_spectral_angle(
        df[[all_columns[i]]],   # First measurement
        df[[all_columns[j]]]    # Second measurement
      )
    }
  }
}

# Creating a heatmap visualization of all pairwise spectral angles
# This shows clusters of similar measurements
heatmap(angle_matrix, 
        main="Spectral Angles Between All Measurements",  
        xlab="Measurement",   
        ylab="Measurement",   
        col=heat.colors(100)) # Color palette (100 shades)

The image above shows:

  • Hierarchical clustering and heatmap of spectral angles between all individual measurements.

  • Darker colors indicate higher similarity (lower spectral angle).

  • The tree diagram shows hierarchical relationships based on spectral similarity.

What this tells us:

  • Measurements cluster primarily by patch type, not by individual.

  • All four orange patches cluster together, as do all black patches.

  • Blue patches mostly cluster together (blue_01 is an outlier).

  • White patches form a distinct cluster.

Importance:

  • Confirms that patch type is the primary determinant of spectral signature.

  • Shows that the spectral angle method successfully groups similar patches.

  • Identifies potential anomalies (like blue_01) that may need further investigation.

  • The hierarchical clustering provides an unbiased grouping method.

PLOT 4: Individual variation by patch type

# Setting up a 2x2 grid for plotting the 4 patch types
par(mfrow=c(2,2))  # 2 rows, 2 columns of plots

# Creating one plot for each patch type
for(patch in patch_types) 
{
  # Creating an empty plot with appropriate limits
  plot(df$wavelength, df[[paste0(patch, "_01")]], 
       type="n",  # No points or lines initially (will add them in a sec)
       # Calculating appropriate y-axis limits
       ylim=c(0, max(sapply(individuals, function(ind) 
       {
         col <- paste0(patch, "_", ind)
         if(col %in% names(df)) max(df[[col]], na.rm=TRUE) else 0
       }) * 1.1)),  # Adding 10% margin
       xlab="Wavelength (nm)",   
       ylab="Reflectance (%)",  
       main=paste("Variation in", patch, "patch"))  
  
  # Adding lines for each individual's spectrum for this patch type
  for(ind in individuals) 
  {
    col_name <- paste0(patch, "_", ind)  # Constructing column name
    if(col_name %in% names(df)) 
    {
      # Plotting this individual's spectrum with a unique color
      lines(df$wavelength, df[[col_name]], 
            col=rainbow(length(individuals))[match(ind, individuals)])
    }
  }
  
  # Adding the mean spectrum as a thick black line
  lines(df$wavelength, patch_means[[patch]], col="black", lwd=2)
  
  # Legend to identify each line
  legend("topright",                                 
         c(individuals, "Mean"),                     
         col=c(rainbow(length(individuals)), "black"),  
         lwd=c(rep(1, length(individuals)), 2),     
         cex=0.7)                                   
}

# Resetting the plotting layout to default (1x1)
par(mfrow=c(1,1))

The image above shows:

  • Four panels showing individual variation within each patch type.

  • Each panel shows spectral curves for each repeat (individual) and the mean.

What this tells us:

  • Black patches: Relatively consistent across individuals with slight variation around 600nm.

  • White patches: Individual 01 shows lower reflectance than others; 02-04 are very similar.

  • Orange patches: Individual 01 shows higher reflectance; Individual 03 shows lower reflectance.

  • Blue patches: Most variable patch type; Individual 03 shows distinctive peak around 580nm.

Importance:

  • Reveals detailed patterns of individual variation within each patch type.

  • Shows which patch types are most consistent or variable across individuals.

  • Helps identify potentially unusual individuals (e.g., Individual 01 for white patches).

Summary statistics by patch type and individual

# Calculating and displaying the average spectral angle within each patch type
cat("\nAverage spectral angle within each patch type:\n")
## 
## Average spectral angle within each patch type:
for(patch in patch_types) 
{
  if(length(within_variation[[patch]]) > 0) 
  {
    # If we have data, calculating and printing the mean
    cat(patch, ":", mean(within_variation[[patch]]), "\n")
  } 
  else 
  {
    # If no data is available, printing a message
    cat(patch, ": No data available\n")
  }
}
## black : 0.05614816 
## white : 0.0333581 
## orange : 0.04366843 
## blue : 0.1182368
# Calculating and displaying the average spectral angle between patches for each individual
cat("\nAverage spectral angle between patches for each individual:\n")
## 
## Average spectral angle between patches for each individual:
for(ind in individuals) 
{
  if(length(between_variation[[ind]]) > 0) 
  {
    # If we have data, calculating and printing the mean
    cat("Individual", ind, ":", mean(between_variation[[ind]]), "\n")
  } 
  else 
  {
    # If no data is available, printing a message
    cat("Individual", ind, ": No data available\n")
  }
}
## Individual 01 : 0.1779697 
## Individual 02 : 0.1813528 
## Individual 03 : 0.1850985 
## Individual 04 : 0.1883944

Lab Meeting Notes 6th March 2025

3: Patch-Based Statistical Analysis Using Boxplots and Nested ANOVA

Images of Museum Specimens

All 11 specimens.
All 11 specimens.
5 specimens (lettered A through E) preserved in 1969 and Pablo (bottom right) preserved in 2024.
5 specimens (lettered A through E) preserved in 1969 and Pablo (bottom right) preserved in 2024.
5 specimens (lettered A through E) preserved in 1938.
5 specimens (lettered A through E) preserved in 1938.

Imaging Procedure

  • Imaging was conducted under controlled lighting conditions.

  • To minimize the effect of camera heating on reflectance measurements, we randomized the order in which the 11 fish were imaged at the start of each session.

  • For each session, all 11 fish were imaged once.

  • After completing the first round of imaging, the hyperspectral camera was turned off and allowed to cool down for 3 minutes.

  • After the cooldown period, all 11 fish were imaged again, following the same procedure.

  • This process was repeated across a total of 5 imaging sessions, resulting in 5 repeat measurements for each fish.